Agrobacterium radiobacter is the only known non-phytopathogenic species in the genus
Agrobacterium. In this study, the whole-genome sequence of the A. radiobacter type strain
DSM 30147T is described and compared with the other available Agrobacterium genomes. This
bacterium has a genome size of 7,122,065 bp distributed in 612 contigs, including 6,834
protein-coding genes and 41 RNA genes. It harbors a circular chromosome and a linear
chromosome but not a tumor-inducing (Ti) plasmid. To the best of our knowledge, this is the
first report of a genome from the A. radiobacter species. In addition, an emended
description of A. radiobacter is provided. This study reveals information that enhances the
current understanding of its non-phytopathogenicity and its phylogenetic position within the
genus Agrobacterium.



Introduction 




A taxonomic classification that relies on phytopathogenic phenotypes may not accurately
reflect the actual phylogenetic relationships of strains within Agrobacterium [10].
Accordingly, an alternative classification method was applied which divided most
Agrobacterium strains into 3 biovariants (Biovars I, II and III) [10]. Among the 3
biovariants, Biovar I is the most complex group and includes several members (genomovars),
designated as genomovars G1 through G9 and G13 [8,11]. At present, two strains in Biovar I
have been completely sequenced: Agrobacterium sp. H13-3 (G1) and A. tumefaciens C58 (G8).
The genome sequencing revealed that these strains contain two chromosomes and different
numbers of plasmids. A. radiobacter DSM 30147T also belongs to Biovar I (it is classified as
a member of genomovar G4), which indicates its close relationship to A. tumefaciens C58 and
Agrobacterium sp. H13-3 [12].

Most strains in the genus Agrobacterium are phytopathogens and induce crown gall tumors or
hairy root diseases in their host plants [2]. However, A. radiobacter is an exception
because it does not have the tumor-inducing (Ti) plasmid that contributes to pathogenicity
[13-16]. A. radiobacter members have been widely found in soil, in the rhizosphere of plants
and in clinical specimens [17]. A strain of A. radiobacter was reported to enhance soil
arsenic phytoremediation, indicating a potential application in bioremediation [18].
However, some members have been identified as opportunistic human pathogens [19]. So far, a
total of 11 Agrobacterium genomes (3 finished and 8 draft genomes, listed in Table 1) have
been sequenced, but no genome of A. radiobacter has been reported. Considering its essential
biological features and important phylogenetic position in the genus Agrobacterium, we
present the genome sequence of A. radiobacter DSM 30147T, the first sequenced strain in this
species.

Descriptions of A. radiobacter were reported in 1902 [1], 1942 [2], 1980 [21] and 1993 [22].
Subsequently, the fatty acid composition and the utilization of additional carbon and
nitrogen sources were tested. The major fatty acids (> 5%) are 16:0, 19:0 cyclo ω8c, summed
feature 2 (one or more of 12:0 aldehyde, iso-16:1 I and 14:0 3-OH) and summed feature 8
(18:1ω7c and/or 18:1ω6c) [23]. The strain can utilize adonitol, D-fructose, D-galactose,
D-mannitol, lactose and raffinose as sole carbon sources and L-ornithine, L-proline and
L-serine as sole nitrogen sources [23]. Citrate utilization, nitrate reduction and urease
are all positive [23]. In this study, we performed additional physiological and biochemical
analyses and present the emended description of A. radiobacter.

Classification and features 

Genome sequences and 16S rRNA genes were used for phylogenetic analysis. In view of the
close evolutionary relationship and the inconsistent phylogeny between Agrobacterium and
Rhizobium [12], we pre-analyzed all sequenced strains in these two genera and found that two
"Rhizobium" members were very closely related to the 12 Agrobacterium members (including
strain DSM 30147T). Thus, all of the 12 Agrobacterium members with sequenced genomes, two
Rhizobium strains (R. lupini HPC(L) and Rhizobium sp. PDO1-076) (Table 1) and an out-group
strain, R. rhizogenes K84 [7,8], were included in the phylogenetic analysis. A comparison of
the 15 genomes revealed a total of 370 proteins that were shared across these genomes. A
rooted neighbor-joining (NJ) phylogenetic tree was constructed based on the shared amino
acid sequences. As shown in Figure 1a, A. radiobacter DSM 30147T was in the same cluster as
the Biovar I members Agrobacterium sp. H13-3 (G1) and A. tumefaciens C58 (G8), and showed
the closest relationship with A. tumefaciens str. Cherry 2E-2-2. An NJ phylogenetic tree was
also constructed based on the 16S rRNA genes (Figure 1b). Small topological differences were
found between the tree generated from the core protein sequences and the tree generated
from the 16S rRNA gene sequences. In comparison to the tree generated using the 370
conserved proteins, some strains could not be clearly distinguished using the 16S rRNA
genes. Therefore, phylogenomic analysis was considered a more robust approach than 16S rRNA
gene analysis for inferring the phylogeny, especially for closely related strains
[21,25,26].

http://standardsingenomics.org
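The neighbor-joining clustering used for both trees can be illustrated with a minimal Python sketch. It is not MEGA's implementation and performs no bootstrapping; the function name `neighbor_joining`, the taxon labels and the toy distance matrix are assumptions made for illustration and are not data from this study.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Cluster taxa by neighbor joining.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the pairwise distance.
    Returns (joins, last_edge), where each join is
    (child_1, branch_1, child_2, branch_2, new_internal_node).
    """
    nodes = list(names)
    d = dict(dist)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence: summed distance from each node to all others.
        r = {i: sum(d[frozenset((i, j))] for j in nodes if j != i) for i in nodes}

        def q(pair):
            # Q criterion; the pair with the smallest value is joined next.
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - r[i] - r[j]

        i, j = min(combinations(nodes, 2), key=q)
        dij = d[frozenset((i, j))]
        branch_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        branch_j = dij - branch_i
        new = f"node{len(joins) + 1}"
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = (d[frozenset((i, k))] + d[frozenset((j, k))] - dij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [new]
        joins.append((i, branch_i, j, branch_j, new))
    a, b = nodes
    return joins, (a, b, d[frozenset((a, b))])


# Toy five-taxon matrix (hypothetical values, not distances from this study).
labels = ["a", "b", "c", "d", "e"]
rows = [[0, 5, 9, 9, 8], [5, 0, 10, 10, 9], [9, 10, 0, 8, 7], [9, 10, 8, 0, 3], [8, 9, 7, 3, 0]]
toy = {frozenset((labels[x], labels[y])): rows[x][y] for x, y in combinations(range(5), 2)}
joins, last_edge = neighbor_joining(labels, toy)
```

In a real analysis the distances would come from aligned sequences (here, the concatenated conserved proteins or the 16S rRNA genes), and branch support would be estimated by repeating the procedure on bootstrap resamples of the alignment.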

Strain DSM 30147T is rod-shaped (0.6-0.8 × 1.5-1.8 μm) (Figure 2). The enzyme activities and
carbon source utilization of strain DSM 30147T were tested using the API ZYM, API 20 NE and
API ID 32 GN systems; the results are shown in Table 2 and in the emended description of
A. radiobacter.

Genome sequencing and annotation 

Genome project history 

To enable a comprehensive genomic comparison of the Agrobacterium genomes, the whole genome
sequence of A. radiobacter DSM 30147T was determined. This draft genome sequence has been
deposited at DDBJ/EMBL/GenBank under accession number ASXY00000000. The version described in
this study is the first version, ASXY01000000. The project information is summarized in
Table 3.

Growth condition and DNA isolation 

A. radiobacter DSM 30147T was grown aerobically in LB medium [38] at 28 °C for 24 h. The DNA
was extracted, concentrated and purified using the QIAamp kit according to the
manufacturer's instructions (Qiagen, Germany).

Genome sequencing and assembly

The whole-genome sequence of A. radiobacter DSM 30147T was determined on an Illumina
HiSeq 2000 using a Paired-End library strategy (300 bp insert size), which yielded a total
of 15,140,909 reads (1.41 Gb of data). The detailed methods of library construction and
sequencing can be found at Illumina's official website [39]. Using SOAPdenovo v1.05 [40],
these reads were assembled into 612 contigs (> 200 bp) with a genome size of 7,122,065 bp
and an average coverage of 196.3×.
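As a rough consistency check on the assembly figures, average coverage can be approximated as sequenced bases divided by assembled genome length, and contig N50 can be computed from the contig lengths. The sketch below assumes this simple definition of coverage; the helper names and the example contig lengths are hypothetical. Because 1.41 Gb is a rounded figure, the result is only expected to land near the reported 196.3×, not to match it exactly.

```python
def sequencing_depth(total_bases, genome_size):
    """Average depth = sequenced bases / assembled genome length."""
    return total_bases / genome_size


def n50(contig_lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


# Figures quoted in the text; 1.41 Gb is rounded, so the depth is approximate.
depth = sequencing_depth(1.41e9, 7_122_065)

# Hypothetical contig lengths, used only to exercise n50().
example_n50 = n50([8, 8, 4, 3, 3, 2, 2, 2])
```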



Table 1. General information and comparison of the 14 Agrobacterium-related genomes (12 Agrobacterium strains
and 2 Rhizobium strains)

Strain                             Isolation source                      Genome size (Mb)  CDSs#  Unique genes#  GenBank No.
A. radiobacter DSM 30147T          Soil                                  7.18              6,834  548            ASXY00000000
A. tumefaciens str. Cherry 2E-2-2  Crown gall infected cherry rootstalk  5.43              5,040  482            APCC00000000
A. tumefaciens CCNWGS0286          Zinc-lead mine tailing                5.21              4,979  489            AGSM00000000
A. albertimagni AOL15              Hot Creek                             5.09              4,811  734            ALJF00000000
Agrobacterium sp. 224MFTsu3.1      Plant-associated                      4.80              4,593  141            ARQL00000000
R. lupini HPC(L)                   Saline desert soil                    5.27              4,614  554            AMQQ00000000
Agrobacterium sp. ATCC 31749       Non plant-associated                  5.46              5,529  984            AECL00000000
A. tumefaciens F2                  Soil                                  5.47              5,288                 AFSD00000000
A. tumefaciens 5A                  Arsenic-enriched calciaquoll soil     5.74              5,517  539            AGVZ00000000
Agrobacterium sp. 10MFCol1.1       Rhizosphere                           5.44              5,280  241            ARLJ00000000
Agrobacterium sp. H13-3            Rhizosphere of Lupinus luteus         5.57              5,345  1,314          GCA_000192635
A. vitis S4                        Vitis vinifera                        6.32              5,389  870            GCA_000016285
Rhizobium sp. PDO1-076             Root of Populus deltoides             5.51              5,347  873            AHZC00000000
A. tumefaciens C58                 Cherry tree tumor                     5.67              5,355  196            GCA_000092025

# Genomes were annotated through the RAST system [20]

Figure 1. Phylogenetic trees highlighting the relationships among A. radiobacter DSM 30147T and other closely
related sequenced strains. (a) A tree built from 370 conserved proteins shared among the 15 genomes (12
Agrobacterium strains, 2 Rhizobium strains very closely related to Agrobacterium and one out-group strain,
R. rhizogenes K84); (b) a tree inferred from the 16S rRNA genes of the same strains. The phylogenies were
inferred by MEGA 5.05 using the neighbor-joining algorithm [20,24], and 1,000 bootstrap repetitions were
computed to estimate the reliability of the branching order. The genome accession numbers of the strains used
in the phylogenetic reconstructions are: A. albertimagni AOL15, ALJF00000000; Rhizobium sp. PDO1-076,
AHZC00000000; A. vitis S4, GCA_000016285; A. radiobacter, ASXY01000000; Agrobacterium sp. H13-3,
GCA_000192635; Agrobacterium sp. 10MFCol1.1, ARLJ00000000; A. tumefaciens 5A, AGVZ00000000; A. tumefaciens
F2, AFSD00000000; A. tumefaciens C58, GCA_000092025; Agrobacterium sp. ATCC 31749, AECL00000000; R. lupini
HPC(L), AMQQ00000000; A. tumefaciens str. Cherry 2E-2-2, APCC00000000; Agrobacterium sp. 224MFTsu3.1,
ARQL00000000; A. tumefaciens CCNWGS0286, AGSM00000000 and R. rhizogenes K84, GCA_000016265.





Figure 2. A transmission electron micrograph of A. radiobacter DSM 30147T, obtained using a
200 kV FEI Tecnai G2 20 TWIN transmission electron microscope (USA). The scale bar
represents 1 μm.
Table 2. Classification and general features of Agrobacterium radiobacter DSM 30147T according to the MIGS
recommendations [27,28]

MIGS ID   Property                    Term                                                Evidence code
          Current classification      Domain Bacteria                                     TAS [29]
                                      Phylum Proteobacteria                               TAS [28]
                                      Class Alphaproteobacteria                           TAS [30,31]
                                      Order Rhizobiales                                   TAS [30,32]
                                      Family Rhizobiaceae                                 TAS [21,33]
                                      Genus Agrobacterium                                 TAS [2,21,22,33-35]
                                      Species Agrobacterium radiobacter                   TAS [21,22,33]
                                      Type strain DSM 30147T                              TAS [1-3]
          Gram stain                  negative                                            TAS [22]
          Cell shape                  rod-shaped                                          TAS [22]
          Motility                    motile                                              IDA
          Sporulation                 non-sporulating                                     TAS [22]
          Optimum temperature         25-28 °C                                            TAS [22]
          Carbon source               arabinose, D-glucose, D-melibiose, D-ribose,        IDA
                                      D-sorbitol, gluconate, histidine,
                                      4-hydroxybenzoate, 3-hydroxybutyrate, inositol,
                                      2-ketogluconate, L-alanine, L-fucose, L-lactate,
                                      L-proline, L-rhamnose, malate, maltose, mannitol,
                                      mannose, N-acetyl glucosamine, propionate,
                                      salicin, sodium acetate and sucrose
          Energy source               chemoorganotroph                                    TAS [22]
          Terminal electron receptor  molecular oxygen                                    TAS [22]
MIGS-6.2  pH                          6-7                                                 TAS [22]
MIGS-22   Oxygen                      aerobic                                             TAS [22]
MIGS-15   Biotic relationship         free-living                                         NAS
MIGS-14   Pathogenicity               non-phytopathogenic
          Biosafety level             level 1; in individual cases, some members of       TAS [36]
                                      this species are suspected human pathogens
MIGS-4    Geographic location         not reported
MIGS-5    Sample collection time      1902                                                TAS [1]
MIGS-4.1  Latitude                    not reported
MIGS-4.2  Longitude                   not reported
MIGS-4.3  Depth                       not reported
MIGS-4.4  Altitude                    not reported

Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement; NAS: Non-traceable Author
Statement. These evidence codes are from the Gene Ontology project [37]. If the evidence is IDA, then the property
was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.
Table 3. Project information

MIGS ID    Property                    Term
MIGS-31    Finishing quality           High-quality draft
MIGS-28    Libraries used              Illumina Paired-End library (300 bp insert size)
MIGS-29    Sequencing platform         Illumina HiSeq 2000
MIGS-31.2  Sequencing coverage         196.3×
MIGS-30    Assemblers                  SOAPdenovo v1.05
MIGS-32    Gene calling method         GeneMarkS+
           GenBank date of release     July 12, 2013
           NCBI project ID             ASXY00000000
MIGS-13    Source material identifier  DSM 30147T
           Project relevance           Genome comparison



Genome annotation 

The draft genome of A. radiobacter DSM 30147T was annotated using the National Center for
Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline (PGAP) [41], which
combines the gene caller GeneMarkS+ [42] with a similarity-based gene detection approach.
Protein function classification was performed by searching all the predicted coding
sequences of strain DSM 30147T against the Clusters of Orthologous Groups (COGs) protein
database [43] using the
Blastp algorithm with E-value cutoff l-ei". 
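The similarity search step can be sketched as a filter over BLAST tabular output that keeps the best-scoring hit per query below an E-value threshold. This is only an illustration under stated assumptions: the input is assumed to be standard 12-column tabular output (`-outfmt 6`), the sequence identifiers are invented, and the default cutoff of 1e-5 is a placeholder rather than the value used in this study.

```python
def best_hits(tabular_lines, evalue_cutoff=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    Expects BLAST tabular output (-outfmt 6) with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > evalue_cutoff:
            continue  # hit is not significant at the chosen cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits; identifiers and scores are made up for illustration.
example = [
    "cds_0001\tCOG0001\t62.5\t320\t110\t3\t1\t318\t5\t322\t1e-120\t410",
    "cds_0001\tCOG0002\t31.0\t150\t95\t6\t20\t160\t12\t158\t2e-08\t55.1",
    "cds_0002\tCOG0003\t28.4\t88\t60\t2\t4\t90\t1\t86\t0.03\t31.2",
]
hits = best_hits(example)
```

Assigning each retained hit to a COG functional category additionally requires the COG membership tables, which are not modeled here.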

Genome properties 

The whole genome of A. radiobacter DSM 30147T is 7,122,065 bp in length, with an average GC
content of 59.9%, and is distributed in 612 contigs. By comparison with the complete
reference genome of A. tumefaciens C58 [44] (which also belongs to Biovar I, Figure 1), the
genome of strain DSM 30147T could clearly be divided into 2 replicons, a circular chromosome
and a linear chromosome (Figure 3). In accordance with its non-phytopathogenic phenotype,
strain DSM 30147T did not contain a Ti plasmid. Of the 6,894 genes predicted, 6,853 were
protein-coding genes (CDSs) and 41 were RNA genes. A total of 5,320 CDSs (77.85%) were
assigned putative functions, and the remaining proteins were annotated as hypothetical
proteins. The genome properties and statistics are summarized in Table 4 and Figure 3. The
distribution of the genes into COG functional categories is shown in Table 5.
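The percentages reported for the genome statistics can be reproduced with simple arithmetic. The sketch below assumes, based on the numbers themselves, that gene classes are expressed relative to the total gene count in Table 4 and that functional categories are expressed relative to the protein-coding gene count in Table 4; the helper name `percent` is hypothetical.

```python
def percent(part, whole, digits=2):
    """Share of `whole` represented by `part`, as a rounded percentage."""
    return round(100 * part / whole, digits)


# Counts taken from Table 4.
total_genes = 7_151
protein_coding = 6_834
with_function = 5_320
in_cogs = 5_333

pct_protein_coding = percent(protein_coding, total_genes)
pct_with_function = percent(with_function, protein_coding)
pct_in_cogs = percent(in_cogs, protein_coding)
```

Under these assumptions the three values agree with the 95.57%, 77.85% and 78.04% listed in Table 4.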

Comparative genome analysis of A. radiobacter DSM 30147T with the other related genomes

Strain DSM 30147T has the largest genome of the 12 Agrobacterium strains sequenced to date,
and its genome is also larger than those of the 2 very closely related Rhizobium strains
(Table 1). OrthoMCL [45] was used to perform ortholog clustering analysis for the 14 genomes
(Table 1). The results indicate that A. radiobacter DSM 30147T shares 1,636 genes with the
other 13 strains and contains 548 strain-specific genes (Table 1), which potentially encode
products that contribute to the species-specific features differentiating A. radiobacter
from other Agrobacterium species [46]. In addition, on average, only 31% of the genes in a
genome were core genes shared among the 14 genomes, which reveals a high degree of diversity
within the genus Agrobacterium.
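One plausible reading of the 31% figure is the mean, over the 14 genomes, of the 1,636 shared genes divided by each genome's CDS count in Table 1. The source does not spell out the calculation, so the sketch below is an assumption about how such an average could be obtained rather than the authors' procedure.

```python
from statistics import mean

CORE_GENES = 1_636  # genes shared by all 14 genomes, as stated in the text

# CDS counts per genome as listed in Table 1.
cds_counts = {
    "A. radiobacter DSM 30147T": 6_834,
    "A. tumefaciens str. Cherry 2E-2-2": 5_040,
    "A. tumefaciens CCNWGS0286": 4_979,
    "A. albertimagni AOL15": 4_811,
    "Agrobacterium sp. 224MFTsu3.1": 4_593,
    "R. lupini HPC(L)": 4_614,
    "Agrobacterium sp. ATCC 31749": 5_529,
    "A. tumefaciens F2": 5_288,
    "A. tumefaciens 5A": 5_517,
    "Agrobacterium sp. 10MFCol1.1": 5_280,
    "Agrobacterium sp. H13-3": 5_345,
    "A. vitis S4": 5_389,
    "Rhizobium sp. PDO1-076": 5_347,
    "A. tumefaciens C58": 5_355,
}

# Percentage of each genome's CDSs that belong to the shared core.
core_share = {strain: 100 * CORE_GENES / n for strain, n in cds_counts.items()}
average_core_share = mean(core_share.values())
```

With these inputs the average comes out at roughly 31%, and the largest genome (DSM 30147T) has the smallest core share.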
Figure 3. Circular representation of the A. radiobacter DSM 30147T circular chromosome (left) and linear
chromosome (right). From outside to center: rings 1 and 4 show protein-coding genes colored by COG categories
on the forward/reverse strand; rings 2 and 3 denote genes on the forward/reverse strand; ring 5 shows the G+C%
content plot; and the innermost ring shows GC skew.
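The two innermost rings of such a genome map are typically derived from sliding windows along the sequence. The sketch below assumes the common definitions GC% = 100 × (G + C) / window length and GC skew = (G − C) / (G + C); the window size, step and toy sequence are hypothetical and are not the settings used to draw the figure.

```python
def gc_content(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq):
    """(G - C) / (G + C); 0.0 when the window contains no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def sliding_windows(seq, size, step):
    """Yield (start, window) pairs along the sequence."""
    for start in range(0, max(len(seq) - size, 0) + 1, step):
        yield start, seq[start:start + size]


# Hypothetical sequence, not taken from the genome.
toy = "ATGGGCGCATTAGGCCCTAA"
profile = [(s, gc_content(w), gc_skew(w)) for s, w in sliding_windows(toy, size=10, step=5)]
```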



Table 4. Genome statistics

Attribute                                         Value      % of Total
Genome size (bp)                                  7,177,085  100
Number of contigs                                 612
Contig N50                                        24,130
Number of replicons                               2
Extrachromosomal elements                         Unknown
Total genes                                       7,151      100
Protein-coding genes                              6,834      95.57
Pseudo genes                                      276        3.86
RNA genes                                         41         0.57
rRNAs                                             4
Frameshifted genes                                95
DNA coding region (bp)                            6,197,065  86.34
Protein-coding genes with function prediction     5,320      77.85
Protein-coding genes assigned to COGs             5,333      78.04
Protein-coding genes with conserved domain        5,986      87.59
Protein-coding genes with transmembrane helices   1,899      27.79
Protein-coding genes with signal peptides         550        8.05

Table 5. Number of protein-coding genes associated with the general COG functional categories in the
A. radiobacter DSM 30147T genome

Code  Value  %age   Description
J     184    2.69   Translation, ribosomal structure and biogenesis
A     0      0.00   RNA processing and modification
K     461    6.75   Transcription
L     157    2.30   Replication, recombination and repair
B     0      0.00   Chromatin structure and dynamics
D     39     0.57   Cell cycle control, cell division, chromosome partitioning
Y     0      0.00   Nuclear structure
V     75     1.10   Defense mechanisms
T     284    4.16   Signal transduction mechanisms
M     282    4.13   Cell wall/membrane/envelope biogenesis
N     99     1.45   Cell motility
Z     0      0.00   Cytoskeleton
W     0      0.00   Extracellular structures
U     100    1.46   Intracellular trafficking, secretion, and vesicular transport
O     197    2.88   Posttranslational modification, protein turnover, chaperones
C     336    4.92   Energy production and conversion
G     585    8.56   Carbohydrate transport and metabolism
E     757    11.08  Amino acid transport and metabolism
F     115    1.68   Nucleotide transport and metabolism
H     224    3.28   Coenzyme transport and metabolism
I     188    2.75   Lipid transport and metabolism
P     481    7.04   Inorganic ion transport and metabolism
Q     148    2.17   Secondary metabolites biosynthesis, transport and catabolism
R     684    10.01  General function prediction only
S     546    7.99   Function unknown
-     1,501  21.96  Not in COGs
Emended description of Agrobacterium radiobacter (Beijerinck and van Delden 1902)
Conn 1942 (Approved Lists 1980) emend. Sawada et al. 1993

This emended description is based on that given by Beijerinck and van Delden 1902, Conn 1942
(Approved Lists 1980) and Sawada et al. 1993, with the following changes. Positive results
are observed for acid phosphatase, α-glucosidase, alkaline phosphatase, arginine
dihydrolase, β-glucosidase, citrate utilization, esterase (C4), leucine arylamidase,
N-acetyl-β-glucosaminidase, naphthol-AS-BI-phosphohydrolase, nitrate reduction, urease and
valine arylamidase, but negative results for α-galactosidase, α-mannosidase, β-fucosidase,
β-galactosidase, β-glucuronidase, chymotrypsin, cystine arylamidase, esterase lipase (C8),
lipase (C14) and trypsin. Arabinose, D-glucose, D-melibiose, D-ribose, D-sorbitol,
gluconate, histidine, 4-hydroxybenzoate, 3-hydroxybutyrate, inositol, 2-ketogluconate, L-